Difference-Preserving Codes

Abstract: 

A code (of integers by binary sequences) is called difference-preserving
(DP-code) if it has the following two properties:

1.  if the absolute value of the difference between
two integers is less than or equal to a certain
threshold, the Hamming distance of their codewords
is equal to this value.

2.  if the absolute value of the difference between
two integers exceeds the threshold, then the
Hamming distance of their codewords also exceeds
this threshold.

Such codes (or slight modifications thereof) have also been called
path-codes, circuit-codes, or snake-in-the-box codes. This paper
discusses the application of DP-codes to pattern recognition and
classification problems, and presents a construction of efficient
DP-codes whose information content is asymptotically (in the length
of codewords) of the order of theoretical upper bounds.

Key  Words  and  Phrases: 

coding,  difference-preserving  codes,  bounded-error  codes, 
path-codes,  snake-in-the-box  codes,  pattern  recognition  and  classification 

1.   Terminology,  definition,  and  problems 

Let $i, j$ be integers and $u, v$ binary sequences of $N$ bits each.
Let $|i-j|$ be the absolute value of the difference between $i$ and $j$, and
let $H(u,v)$ be the Hamming distance between $u$ and $v$ (i.e. the number of
positions in which $u$ and $v$ differ). Let $t \ge 1$ be an integer called the
threshold, and $K$ be a positive integer called the range. Let $\mathcal{B}$,
the code, be a mapping from the set $\{1,2,\ldots,K\}$ into the set $\{0,1\}^N$
of binary sequences of length $N$.

We say that $\mathcal{B}$ is a difference-preserving code with threshold $t$,
or DP-t code, if and only if, for all integers $i, j$ in the range
$1 \le i \le K$, $1 \le j \le K$, the following two conditions hold:

1)  $|i-j| \le t \Rightarrow H(\mathcal{B}(i), \mathcal{B}(j)) = |i-j|$

2)  $|i-j| > t \Rightarrow H(\mathcal{B}(i), \mathcal{B}(j)) > t$.

Intuitively, the code $\mathcal{B}$ preserves the difference between
integers, whereby all differences larger than $t$ are lumped together.

When  we  want  to  specify  the  length  N  of  codewords  of  a  DP-t 
code,  we  may  speak  of  an  (N,t)-code,  and  if  in  addition  we  want  to 
specify  the  range,  we  may  speak  of  a  (K,N,t)-code. 

As an example, the following is an (8,4,1)-code (range = 8,
length = 4, threshold = 1):

    integer    codeword
       1         0000
       2         0001
       3         0011
       4         0111
       5         0110
       6         1110
       7         1100
       8         1101

This code is optimal in the sense that there is no (4,1)-code with
range larger than 8.
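The two conditions of the definition can be checked mechanically. The following is a minimal sketch in Python, not part of the original paper; the function names `hamming` and `is_dp_code` are illustrative choices. It tests the (8,4,1)-code of the table against both conditions by brute force over all pairs of integers.

```python
from itertools import combinations

def hamming(u, v):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(u, v))

def is_dp_code(codewords, t):
    """Check both DP-t conditions; codewords[0] encodes 1, codewords[1] encodes 2, ..."""
    for i, j in combinations(range(len(codewords)), 2):
        diff = j - i                                   # |i - j|, since i < j
        dist = hamming(codewords[i], codewords[j])
        if diff <= t and dist != diff:                 # condition 1 violated
            return False
        if diff > t and dist <= t:                     # condition 2 violated
            return False
    return True

# The (8,4,1)-code from the table above.
EXAMPLE = ["0000", "0001", "0011", "0111", "0110", "1110", "1100", "1101"]

print(is_dp_code(EXAMPLE, 1))   # the table is a DP-1 code
print(is_dp_code(EXAMPLE, 2))   # but not a DP-2 code: integers 2 and 8 are at distance 2
```

The same checker applies unchanged to any candidate (K,N,t)-code given as a list of bit strings, at a cost quadratic in K.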

It is natural to ask the following types of questions about
difference-preserving codes:

--  What is the maximal range K for given length N and threshold t?

--  What is the minimal length N for given range K and threshold t?

--  What is the maximal threshold t for given range K and length N?

--  How can one systematically construct DP-codes for various
choices of the parameters K, N and t?

--  Are there efficient encoding and decoding algorithms for
various DP-codes?

This paper will answer some of these questions, but first
let us motivate our interest in difference-preserving codes, and survey
the known results.

2.   Motivation  and  applications 

Our  main  interest  in  difference-preserving  codes  stems  from 
their  use  in  pattern  recognition  and  classification  problems,  where  the 
following  technique  is  standard. 

With an object $A$ one associates a vector $(a_1, a_2, \ldots, a_f)$
where each component represents a feature that is measured by an integer
value. The decision whether two objects $A$ and $B$ are equivalent is then
based on whether or not their corresponding feature vectors $(a_1, \ldots, a_f)$
and $(b_1, \ldots, b_f)$ are close enough. In one of the most useful metrics,
this amounts to deciding whether the inequality

$$\sum_{i=1}^{f} |a_i - b_i| \le t$$

holds, where $t$ is some threshold.

DP-codes allow this decision to be made very efficiently, by
replacing arithmetic operations (difference, absolute value, sum) by
boolean operations (exclusive-or and "population count", i.e. the number
of 1's in a binary sequence). Assume that a feature vector $(a_1, \ldots, a_f)$
can be stored in a single memory cell $\alpha$, each component $a_i$ having been
assigned a field of sufficient length to hold all possible values of this
component. If these values are represented by a DP-t code, then the
inequality $\sum_{i=1}^{f} |a_i - b_i| \le t$ holds if and only if the result
of the bitwise exclusive-or of memory cells $\alpha$ and $\beta$, the latter
holding the vector $(b_1, \ldots, b_f)$, has at most $t$ 1's. On a computer
with long word-length and a built-in population count operation (such as the
CDC-6000 machines), this coding technique can speed up the comparison of
feature vectors drastically. Since feature-vector comparisons
frequently occur in inner loops of pattern classification programs,
we believe that DP-codes are an important technique in pattern recognition
and classification.
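The comparison just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's code: it uses the unary ("thermometer") code, in which a value a is stored as a one-bits, because that code preserves every difference exactly and is therefore a DP-t code for any threshold. The names `thermometer`, `pack` and `close_enough`, and the convention that feature values run from 0 to `width`, are hypothetical choices made for this sketch.

```python
def thermometer(value, width):
    """Encode value in 0..width as `value` one-bits inside a field of `width` bits."""
    assert 0 <= value <= width
    return (1 << value) - 1

def pack(features, width):
    """Concatenate the encoded fields of a feature vector into one integer 'memory cell'."""
    cell = 0
    for k, a in enumerate(features):
        cell |= thermometer(a, width) << (k * width)
    return cell

def close_enough(a, b, width, t):
    """Decide sum(|a_i - b_i|) <= t with one exclusive-or and one population count."""
    return bin(pack(a, width) ^ pack(b, width)).count("1") <= t

a, b = [3, 0, 7], [1, 2, 7]             # sum of absolute differences is 2 + 2 + 0 = 4
print(close_enough(a, b, width=7, t=4))  # within threshold 4
print(close_enough(a, b, width=7, t=3))  # not within threshold 3
```

In practice the cells would be packed once, when the feature vectors are stored, so that each comparison in the inner loop costs only the exclusive-or and the population count. The thermometer code is wasteful in bits; the point of the constructions in this paper is to obtain the same decision with shorter codewords.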

From  the  point  of  view  of  this  application,  we  are  interested 
in  DP-codes  for  small  ranges  K,  and  thresholds  t  that  may  be  close  to  K. 
Typical  values  to  be  encoded  might  be  the  grey-levels  of  a  digitized 
picture,  the  number  of  characters  of  English  words,  or  the  number  of 
vertices  of  a  graph  that  is  abstracted  from  a  handwritten  character. 
In  all  of  these  cases  a  range  of  the  order  of  magnitude  of  10  is  likely 
to  suffice.   Such  small  DP-codes  can  be  constructed  by  ad-hoc  procedures, 
or  by  techniques  such  as  those  described  in  section  5. 

However,  DP-codes  are  also  interesting  from  the  point  of  view 
of  coding  theory,  since  they  have  common  aspects  with  two  well-known 
classes  of  codes: 

a)  Gray  codes,  which  are  characterized  as  the  special  case 
t  =  1  of  the  first  requirement  of  DP-codes: 

$|i-j| \le t \Rightarrow H(\mathcal{B}(i), \mathcal{B}(j)) = |i-j|$

b)  Error-correcting  codes,  which  share  with  DP-codes  the 
requirement  that  codewords  (all  in  the  case  of  error-correcting  codes, 
all  but  some  in  the  case  of  DP-codes)  are  at  a  certain  minimal  distance 
from  each  other. 

Because of these two properties, DP-codes might also be
called codes of bounded error: if any $b$ bits ($b \le t$) in a codeword
$\mathcal{B}(i)$ of a DP-t code are in error, the resulting binary sequence either
is not a codeword at all (error-detection), or else it is a codeword
$\mathcal{B}(j)$ of an integer $j$ with $|i-j| = b \le t$ (bounded error).

A DP-t code can also provide a limited form of error-correction
as follows: if any $b \le \lfloor t/2 \rfloor$ bits in a codeword
$\mathcal{B}(i)$ are in error, then the resulting binary sequence can be decoded
as an integer $j$ such that $|i-j| \le 2b \le t$.
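The bound $|i-j| \le 2b$ follows from the triangle inequality when the corrupted sequence is decoded to a nearest codeword. The sketch below, which is an illustration and not taken from the paper, checks this exhaustively on a small case. It assumes the unary ("thermometer") code as the DP-t code, since that code satisfies the DP conditions for every threshold; the names `codeword`, `decode_nearest` and `worst_decoding_error` are hypothetical.

```python
from itertools import combinations

def codeword(i, k):
    """Thermometer codeword of integer i in 1..k: (i - 1) one-bits in k - 1 positions."""
    return (1 << (i - 1)) - 1

def decode_nearest(word, k):
    """Return an integer whose codeword is closest to `word` in Hamming distance."""
    return min(range(1, k + 1), key=lambda j: bin(word ^ codeword(j, k)).count("1"))

def worst_decoding_error(k, b):
    """Largest |i - j| over all integers i and all patterns of exactly b bit errors."""
    worst = 0
    for i in range(1, k + 1):
        for positions in combinations(range(k - 1), b):
            corrupted = codeword(i, k)
            for p in positions:
                corrupted ^= 1 << p          # flip one bit of the codeword
            worst = max(worst, abs(i - decode_nearest(corrupted, k)))
    return worst

print(worst_decoding_error(8, 1) <= 2)   # b = 1 error: decoded integer is off by at most 2
print(worst_decoding_error(8, 2) <= 4)   # b = 2 errors: off by at most 4
```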

From the point of view of coding theory, one is interested in
the asymptotic properties of DP-codes for large ranges. We address this
question in sections 3 and 4, where we describe an efficient class of
DP-codes, and compare their information content to theoretical upper
bounds.

It  is  this  aspect  of  bounding  or  detecting  errors  which  has 
motivated  most  of  the  early  work  on  DP-codes.   Kautz  [8]   introduced 
such  codes  in  the  special  case  t  =  1,  discussed  their  application  to 
analog-to-digital  conversion,  and  gave  upper  and  lower  bounds  for  the 
maximal  number  of  codewords  in  circuit  codes  (a  slight  modification  of 
the  DP-codes  discussed  here,  where  the  sequence  of  codewords  forms  a 
closed  path).   Vasil'ev  [13]  improved  on  these  lower  bounds  by  constructing 
circuit  codes  for  t  =  1  exploiting  the  connection  already  mentioned  above 
between  error-correcting  codes  and  DP-codes.   The  class  of  DP-t  codes 
described  in  Section  3  of  this  paper  results  from  a  construction  also 
based  on  the  mentioned  connection  between  the  two  families  of  codes. 
Our  construction  is  substantially  different  from  that  of  Vasil'ev  and 
holds  for  arbitrary  values  of  the  threshold  t. 

Chien, Freiman and Tang [1] describe code-construction
techniques based on the idea of combining two small codes into one
larger one, for arbitrary threshold values. They also introduced a
new technique for obtaining good upper bounds on the number of codewords.

Further code-construction techniques and improved bounds are
described in various papers, such as Singleton [12], Klee [9,10],
Danzer and Klee [2], Douglas [4,5], and Wyner [14].

The main contribution of this paper is a new class of DP-t
codes which have, asymptotically for large codeword length N and fixed
threshold t, a higher information content than any of the known DP-codes.
We  also  describe  some  new  code-construction  techniques  suitable  for 
constructing  small  codes,  and  we  have  already  described  earlier  the 
application  of  DP-codes  for  the  fast  computation  of  the  distance  of 
vectors  as  it  is  used  in  pattern  classification  problems. 

3.   A  Class  of  DP-Codes  Based  on  Error-Correcting  Codes 

The class of DP-codes to be presented in this section is based
on an idea which we now informally describe. As stated before, an
$(N,t)$ DP-code is a sequence of vertices in an N-cube path so that each
vertex $v$ on the path is at distance greater than $t$ from any other code
vertex, with the exception of those vertices corresponding to integers
within difference $t$ from the integer represented by $v$. If one considers
now a binary t-error-correcting code $C'$ of some length $N'$, each code
point in $C'$ is at distance at least $(2t+1)$ from any other codepoint in
$C'$. Thus we may think of threading all of the points of $C'$ with an
N'-cube path $\mathcal{B}'$ in the hope of obtaining an $(N',t)$ DP-code as the
sequence of vertices of this path. Although this construction certainly meets
the first condition of a DP-code (see section 1) it may not meet the
second. In fact the N'-cube path just mentioned may be thought of as
the catenation of subpaths each of which is contained within the
decoding set of a code point $v \in C'$. While this ensures that each point
$v \in C'$ is at a Hamming distance greater than $t$ from any other point of
the N'-cube path, this may not occur for points towards the ends of
subpaths within decoding sets. To obtain the necessary distance, we
may think of constructing a code $\mathcal{B}$ as a subset of the cartesian
product of $\mathcal{B}'$ and of a new code $\mathcal{B}''$ of length $N''$. The
function of $\mathcal{B}''$ is to provide the necessary distance where
$\mathcal{B}'$ is more likely to violate the second condition of DP-codes. Thus
the resulting (N'+N'')-cube path will be traced on the coordinates of
$\mathcal{B}''$ when the coordinates of $\mathcal{B}'$ are safe, and vice versa.
As we shall now show more formally, such an $(N'+N'',t)$ DP-code can be
constructed.

We begin by constructing the code $\mathcal{B}'$. An important property
of $\mathcal{B}'$ is better elucidated by resorting to the polynomial
representation of binary sequences, or vectors; in other words, with a vector
$f = (f_0, f_1, \ldots, f_{N'-1})$ we associate the polynomial
$f(x) = \sum_{i=0}^{N'-1} f_i x^i$.
We also introduce some necessary nomenclature. For some positive
integer $N'$, $A_{N'}$ denotes the algebra of polynomials over GF(2) in $x$
modulo $(x^{N'} - 1)$. For $f(x) \in A_{N'}$, the weight $W[f(x)]$ of $f(x)$ is
the number of nonzero coefficients of $f(x)$, and
$W[f_1(x) + f_2(x)] = H(f_1(x), f_2(x))$
is the Hamming distance between $f_1(x)$ and $f_2(x)$.

Let $C' \subset A_{N'}$ be a cyclic binary code with odd actual
minimum distance $d = 2t+1$. These requirements are satisfied, for
example, by a primitive binary BCH-code (see, e.g. [11], p. 282). Let
$g(x)$ be the generator polynomial of $C'$ and $m(x)$ be a polynomial which
has the minimum degree among the minimum weight polynomials in $C'$ (clearly
$g(x) = m(x)$ when $W[g(x)] = d$). We define $s = N' - \deg[m(x)] > 0$ and
consider the set of polynomials

$$F = \{ f(x) \mid \deg[f(x)] < s \}.$$

We may now order the elements of $F$ to form a sequence
$f_0(x), f_1(x), \ldots, f_{2^s-1}(x)$ so that $W[f_i(x) + f_{i+1}(x)] = 1$ for
$i = 0, 1, \ldots, 2^s-1$ (indices mod $2^s$).
It is well-known that such sequences exist. They are referred to as
Gray code sequences and correspond to Hamiltonian circuits on the binary
s-cube. Define now

$$v_i(x) = f_i(x)\, m(x).$$

We  have  the  following  simple  result: 

Lemma 1: The set $V = \{ v_i(x) \mid v_i(x) = f_i(x) m(x),\ f_i(x) \in F \}$ is
contained in $C'$, all the $v_i(x)$'s are distinct and
$H(v_i(x), v_{i+1}(x)) = d$ for $i = 0, 1, \ldots, 2^s-1$ (indices mod $2^s$).

Proof:  (i) Since each $v_i(x)$ is a multiple of a codeword $m(x)$ it is
also a multiple of $g(x)$, hence it belongs to $C'$.

(ii) To show that all the elements of $V$ are distinct, assume
$v_i(x) = v_j(x)$ for $i \ne j$. Letting $m(x) = p(x) g(x)$, we have
$\deg[m(x)] = \deg[p(x)] + \deg[g(x)]$. The equality $v_i(x) = v_j(x)$ can be
rewritten as $(f_i(x) + f_j(x))\, p(x)\, g(x) = 0$, i.e., $(f_i(x) + f_j(x))\, p(x)$
must be a multiple of $h(x) = (x^{N'} + 1)/g(x)$. But
$\deg[(f_i(x) + f_j(x))\, p(x)] = \deg[p(x)] + \deg[f_i(x) + f_j(x)] < \deg[p(x)] + s$,
whereas $\deg[h(x)] = N' - (\deg[g(x)] + \deg[p(x)]) + \deg[p(x)] = s + \deg[p(x)]$.
Thus $h(x)$ cannot divide $(f_i(x) + f_j(x))\, p(x)$.

(iii) Recall that $H(v_i(x), v_{i+1}(x)) = W[(f_i(x) + f_{i+1}(x))\, m(x)]$;
since $f_i(x) + f_{i+1}(x) = x^{\delta}$ for some $\delta$ in the range $(0, s-1)$, then
$H(v_i(x), v_{i+1}(x)) = W[x^{\delta} m(x)] = W[m(x)] = d$.
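Lemma 1 can be checked numerically on a small case. The sketch below is an illustration, not part of the paper. It assumes the (7,4) cyclic Hamming code with generator $1 + x + x^3$ (the paper does not say which generator it has in mind), so that $m(x) = g(x)$, $d = 3$ and $s = 4$, and it takes the reflected binary Gray code as the sequence $f_i$. Polynomials are stored as integers whose bit k is the coefficient of $x^k$; the helper name `gf2_mul_mod` is hypothetical.

```python
def gf2_mul_mod(a, b, n):
    """Multiply two GF(2) polynomials (bit k = coefficient of x^k) modulo x^n - 1."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a            # addition over GF(2) is exclusive-or
        a <<= 1
        b >>= 1
    mask = (1 << n) - 1
    while prod >> n:             # x^n = 1, so fold the high part back onto the low part
        prod = (prod & mask) ^ (prod >> n)
    return prod

N_PRIME = 7                      # length of the assumed Hamming code
M = 0b1011                       # m(x) = 1 + x + x^3, weight d = 3
D = bin(M).count("1")
S = N_PRIME - (M.bit_length() - 1)          # s = N' - deg m(x) = 4

gray = [i ^ (i >> 1) for i in range(2 ** S)]         # f_0, ..., f_{2^s - 1}
v = [gf2_mul_mod(f, M, N_PRIME) for f in gray]       # v_i(x) = f_i(x) m(x)

all_distinct = len(set(v)) == len(v)
steps = [bin(v[i] ^ v[(i + 1) % len(v)]).count("1") for i in range(len(v))]
print(all_distinct, set(steps) == {D})   # distinct points, every cyclic step has distance d
```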

In the sequence $f_0(x), \ldots, f_{2^s-1}(x)$, in passing from $f_i(x)$ to
$f_{i+1}(x)$ there is exactly one coefficient that changes: let it be the
coefficient of $x^{\delta(i)}$, where $\delta(i) \in \{0, 1, \ldots, s-1\}$. We
denote the pair $(v_i, v_{i+1})$ as a $\delta(i)$-transition.

We now construct a code $\mathcal{B}'$ as a sequence of N'-dimensional
points as follows:

1.  Let $x^{i_1}, x^{i_2}, \ldots, x^{i_d}$ be the powers of $x$ whose
coefficients are nonzero in the polynomial $m(x)$. Construct the
N'-cube path $V_i$ as the sequence of points $v_i = v'_0, v'_1, \ldots, v'_{d-1}$,
where $v'_{j-1}$ and $v'_j$ differ exactly in the coefficient of
$x^{i_j + \delta(i)}$ for $j = 1, 2, \ldots, d-1$.

2.  $\mathcal{B}'$ is the N'-cube path obtained by catenating the paths
$V_0, V_1, \ldots, V_{2^s-1}$.

This code $\mathcal{B}'$ has an interesting property. Since all points at
Hamming distance less than or equal to $(d-1)/2$ from a point in $C'$ belong
to distinct cosets of the vector subspace $C'$, it is clear that the points
in a path segment $V_i$ are in different cosets and that a unique collection
of $(d-1)$ cosets is associated with each of the $s$ transition types.

We now consider the construction of the code $\mathcal{B}''$. We begin
by introducing a code $C''$ as a set of at least $s$ N''-dimensional points
(as many as the coefficients of the $f_i(x)$'s) with minimum Hamming
distance $(d-1)/2$. Denoting by $u$ an arbitrary element of $C''$, we now
define an injective mapping $\varphi: \{0, 1, \ldots, s-1\} \to C''$, so that
$u = \varphi(r)$ means that $u$ is associated with the r-transition. If $u_i$
and $u_j$ are two points in $C''$ at Hamming distance $q$, the sequence of the
first $q$ elements of the N''-cube path $u_i = u'_0, u'_1, \ldots, u'_q = u_j$ is
denoted as $U_{ij}$. We define a code $\mathcal{B}''$ as the set of points on
the paths $U_{ij}$ for all $u_i$ and $u_j$ in $\varphi\{0, 1, \ldots, s-1\}$.

We have thus developed the necessary nomenclature for the
construction of a DP-code $\mathcal{B} \subset \mathcal{B}' \times \mathcal{B}''$.
A code word $w$ of $\mathcal{B}$ is of the form $[v,u]$, where
$v \in \mathcal{B}'$ and $u \in \mathcal{B}''$. The code $\mathcal{B}$, which is a
mapping $\mathcal{B}: \{1, \ldots, K\} \to \{0,1\}^N$, is constructed as follows:

1.  For each $v_i \in V$, let $u_i = \varphi[\delta(i)]$. Let
$H(u_{i-1}, u_i) = q_i$. We construct the two sequences $v_i U_{i-1,i}$ and
$V_i u_i$ (i.e., for example, $v_i U_{i-1,i}$ is the sequence $U_{i-1,i}$ to
each element of which we juxtapose $v_i$ on the left). Note that
$v_i U_{i-1,i}$ contains $q_i$ elements and that $V_i u_i$ contains $d$
elements. We then define:

$$W_i = (v_i U_{i-1,i})(V_i u_i)$$

as the catenation of $v_i U_{i-1,i}$ and $V_i u_i$ ($v_i U_{i-1,i}$ preceding
$V_i u_i$).

2.  The code $\mathcal{B}$ is the catenation of the paths
$W_0, W_1, \ldots, W_{2^s-1}$, from which exactly $(d-1)/2$ consecutive points
have been removed.

Notice that $W_0$ is defined as $(v_0 U_{-1,0})(V_0 u_0)$, where
$u_{-1} = \varphi[\delta(2^s-1)]$, since the sequence
$f_0(x), \ldots, f_{2^s-1}(x)$ is a Hamiltonian circuit in the binary s-cube.

We  now  claim: 
Theorem 1:  The code $\mathcal{B}$ is an $(N'+N'', (d-1)/2)$ DP-code.

Proof:  Let $w^{(1)} = [v^{(1)}, u^{(1)}]$ and $w^{(2)} = [v^{(2)}, u^{(2)}]$ be
two codewords of $\mathcal{B}$. We first consider the first requirement. If
$w^{(1)}$ and $w^{(2)}$ belong to the same $W_i$, then
$H(w^{(1)}, w^{(2)}) = d(\mathcal{B}^{-1}(w^{(1)}), \mathcal{B}^{-1}(w^{(2)}))$ for
$H(w^{(1)}, w^{(2)}) \le d-1 + (d-1)/2$ and the requirement is met. The same
happens when $w^{(1)}$ and $w^{(2)}$ belong to two consecutive $W_i$'s.

We must now show that if
$d(\mathcal{B}^{-1}(w^{(1)}), \mathcal{B}^{-1}(w^{(2)})) > (d-1)/2$,
then also $H(w^{(1)}, w^{(2)}) > (d-1)/2$. Notice that:

$$H = H(w^{(1)}, w^{(2)}) = H(v^{(1)}, v^{(2)}) + H(u^{(1)}, u^{(2)})$$

and consider the following cases:

1.  $v^{(1)} \in C'$. If $v^{(1)} = v^{(2)}$, the condition is obviously met
since both $w^{(1)}$ and $w^{(2)}$ belong to the same $W_i$ sequence. Suppose
$v^{(1)} \ne v^{(2)}$. If $v^{(2)} \in C'$, then $H(v^{(1)}, v^{(2)}) \ge d$. If
$v^{(2)} \notin C'$, then it belongs to the sphere of radius $(d-1)/2$ centered
on some other codepoint $v^* \in C'$; it follows that
$H \ge H(v^{(1)}, v^*) - H(v^*, v^{(2)}) \ge d - (d-1)/2 = (d-1)/2 + 1 > (d-1)/2$.

2.  $v^{(1)} \notin C'$, $v^{(2)} \notin C'$. Assume at first that
$u^{(1)} = u^{(2)}$, i.e., $H = H(v^{(1)}, v^{(2)})$. In this case
$v^{(1)} \in V^{(1)}$ and $v^{(2)} \in V^{(2)}$, where $V^{(1)}$ and $V^{(2)}$
are paths pertaining to a transition of the same type.
Then if $v^{(1)}$ and $v^{(2)}$ belong to the same coset,
$H(v^{(1)}, v^{(2)}) \ge d$ since each coset of $C'$ has the same distance
structure as $C'$. Suppose that $v^{(1)}$ and $v^{(2)}$ belong to different
cosets of $C'$, i.e., $v^{(1)} \in C^{(1)}$ and $v^{(2)} \in C^{(2)}$. Then let
$v_i$ and $v_{i+1}$ be the elements of $C'$ which delimit $V^{(1)}$ and $v^*$
be the element of $V^{(1)}$ in $C^{(2)}$; we can write
$H(v_i, v^{(1)}) + H(v^{(1)}, v^*) + H(v^*, v_{i+1}) = d$ by the construction
of the code $\mathcal{B}'$. Now, if $H(v^{(1)}, v^*) \le (d-1)/2$, then
$H \ge H(v^{(2)}, v^*) - H(v^*, v^{(1)}) \ge d - (d-1)/2$; the same holds
otherwise, due to the distance property of $C'$.

Finally, if $V^{(1)}$ and $V^{(2)}$ pertain to different types of
transitions, then $H(u^{(1)}, u^{(2)}) \ge (d-1)/2$ by the property of
$\mathcal{B}''$ and $H(v^{(1)}, v^{(2)}) \ge 1$, since $v^{(1)}$ and $v^{(2)}$
belong to the decoding subsets of two distinct points of $C'$. Hence,
$H \ge (d-1)/2 + 1 > (d-1)/2$.   ■

We now evaluate the efficiency of the DP-code $\mathcal{B}$ just constructed.
The number $K$ of codewords in $\mathcal{B}$ is $(d-1)/2$ less than the sum of
the numbers of codewords in each segment $W_i$ ($i = 0, \ldots, 2^s-1$).
Denoting by $|W_i|$ the cardinality of $W_i$ we have

$$K = \sum_{i=0}^{2^s-1} |W_i| - \frac{d-1}{2}.$$

With the same notation we obtain $|W_i| = |U_{i-1,i}| + |V_i|$. While
$|V_i| = d$ for every $i$, $|U_{i-1,i}| = q_i$ depends upon $i$. Thus
$K = 2^s d - (d-1)/2 + \sum q_i$. The value of $\sum q_i$ depends clearly upon
the mapping $\varphi$, i.e., upon the assignment of vectors $u \in C''$ to
transitions.

The selection of this mapping is very relevant to establishing
the value of $K$ and, as we shall see later, to the encoding and decoding
algorithms. To gain some insight into this problem, we must refer to the
choice of the sequence $f_0(x), \ldots, f_{2^s-1}(x)$. This sequence of
s-dimensional vectors is equivalent to a sequence $T_s$ over the integers
$\{0, 1, \ldots, s-1\}$, such that, if the i-th element of $T_s$ is $r$, we have
$f_{i-1}(x) + f_i(x) = x^r$. One possible choice of $T_s$ is provided by the
standard Gray code sequence, which is defined by

    $T_1 = 0$
    $T_j = T_{j-1}\,(j-1)\,T_{j-1}$   for $j = 2, \ldots, s-1$
    $T_s = T_{s-1}\,(s-1)\,T_{s-1}\,(s-1)$
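The recursion for $T_s$ can be written out directly. The following sketch is an illustration, not the paper's code; the function names are hypothetical. It builds $T_s$ by the recursion above and compares it with the bit positions that change between cyclically consecutive words of the reflected binary Gray code $g(i) = i \oplus \lfloor i/2 \rfloor$, which is assumed here to be the "standard Gray code" the text refers to.

```python
def transition_sequence(s):
    """T_s built by the recursion in the text (valid for s >= 2)."""
    t = [0]                                  # T_1 = 0
    for j in range(2, s):                    # T_j = T_{j-1} (j-1) T_{j-1}, j = 2..s-1
        t = t + [j - 1] + t
    return t + [s - 1] + t + [s - 1]         # T_s closes the Hamiltonian circuit

def gray_flips(s):
    """Bit position that changes between consecutive reflected Gray code words (cyclic)."""
    n = 2 ** s
    words = [i ^ (i >> 1) for i in range(n)]
    return [(words[i] ^ words[(i + 1) % n]).bit_length() - 1 for i in range(n)]

print(transition_sequence(3))                      # [0, 1, 0, 2, 0, 1, 0, 2]
print(transition_sequence(4) == gray_flips(4))     # recursion agrees with the Gray code
```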


Hereafter we shall only consider such standard Gray code sequences.
With this restriction, consecutive pairs in $T_s$ are either of type $(0,j)$
or of type $(j,0)$ ($j = 1, \ldots, s-1$). Furthermore, denoting by $\nu_j$ the
total multiplicity of pairs $(j,0)$ and $(0,j)$, we readily obtain
$\nu_{s-1} = 4$ and $\nu_j = 2^{s-j}$ for $j = 1, 2, \ldots, s-2$. If we now
define as $d_j$ the Hamming distance between $\varphi(0)$ and $\varphi(j)$ in
the code $C''$, then clearly

(a)   $\sum_{i=0}^{2^s-1} q_i = \sum_{j=1}^{s-1} \nu_j d_j = \sum_{j=1}^{s-1} 2^{s-j} d_j + 2 d_{s-1}$.

Due to the fact that $C''$ has minimum distance at least $(d-1)/2$, we have
$d_j \ge (d-1)/2$ for every $j$, and we obtain the lower bound on $K$

$$K \ge 2^s d - \frac{d-1}{2} + (d-1)\,2^{s-1} = (3d-1)\,2^{s-1} - \frac{d-1}{2}.$$

Of course, the most interesting cases occur when $s$ coincides with the
number of information bits in the primitive BCH code $C'$ of length
$N' = 2^n - 1$. Since $C'$ has at most $n(d-1)/2$ parity digits (see [11], p. 272),
we obtain

(b)   $K \ge (3t+1)\, 2^{2^n - 1 - nt} - t$,

where use has been made of the equality $t = (d-1)/2$.
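Expression (a) and the resulting value of $K$ are easy to evaluate mechanically. The sketch below is an illustration under stated assumptions, not the paper's code: it regenerates the standard transition sequence $T_s$, counts the multiplicities $\nu_j$ directly from cyclically consecutive pairs, and evaluates $K = 2^s d - (d-1)/2 + \sum_j \nu_j d_j$. The function name `dp_code_range` and the dictionary `dist` (mapping $j$ to $d_j$) are hypothetical. The two sample inputs use the parameter choices of the Hamming-code and Golay-code examples given at the end of this section.

```python
from collections import Counter

def dp_code_range(s, d, dist):
    """K = 2^s d - (d-1)/2 + sum_j nu_j d_j, where dist[j] = d_j for j = 1..s-1."""
    t_seq = [0]
    for j in range(2, s):
        t_seq = t_seq + [j - 1] + t_seq
    t_seq = t_seq + [s - 1] + t_seq + [s - 1]          # standard sequence T_s
    nu = Counter()
    for a, b in zip(t_seq, t_seq[1:] + t_seq[:1]):     # cyclically consecutive pairs
        nu[a + b] += 1                                 # one of a, b is 0, so a + b = j
    sum_q = sum(nu[j] * dist[j] for j in range(1, s))  # expression (a)
    return 2 ** s * d - (d - 1) // 2 + sum_q

# (7,4) Hamming code as C': s = 4, d = 3, with d_1 = 2 and d_2 = d_3 = 1.
print(dp_code_range(4, 3, {1: 2, 2: 1, 3: 1}))          # 71

# (23,12) Golay code as C': s = 12, d = 7, weights 7, 4 (seven times), 3 (three times).
golay = {1: 7, **{j: 4 for j in range(2, 9)}, **{j: 3 for j in range(9, 12)}}
print(dp_code_range(12, 7, golay))                      # 51181
```

Setting every $d_j$ to its minimum $(d-1)/2$ reproduces the lower bound $(3d-1)2^{s-1} - (d-1)/2$; for $s = 4$, $d = 3$ this gives 63.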

The actual number $N$ of bits required by a DP-t code $\mathcal{B}$ depends
on the selection of the parameters $n$, $t$ and of the auxiliary code $C''$.
Therefore we shall only attempt to compute a reasonably accurate upper
bound to attainable values of $N$. The Varshamov-Gilbert bound (see,
e.g. [6], p. 537) ensures that there is an error-correcting code
$C''$ of length $N''$ with minimum distance $t$ and $s$ codewords if the
following inequality holds:

(c)   $N'' - \log_2 \sum_{i=0}^{t-1} \binom{N''}{i} \ge \log_2 s$.

This inequality is certainly satisfied if we choose $N''$ so that
$N'' - R \ge \log_2 s$ for some $R \ge \log_2 \sum_{i=0}^{t-1} \binom{N''}{i}$.
In addition, we confine ourselves to $N'' \ge 4(t-1)$; then (see [6], p. 530)
we can use the well-known inequality

$$\log_2 \sum_{i=0}^{t-1} \binom{N''}{i} \le N'' H\!\left(\frac{t-1}{N''}\right)$$

where $H(z) = -z \log_2 z - (1-z) \log_2 (1-z)$ is the binary entropy function.
Moreover, for $(t-1)/N'' \le 1/4$ we have

$$H\!\left(\frac{t-1}{N''}\right) \le 16A\,\frac{t-1}{N''}\left(1 - \frac{t-1}{N''}\right) + (1 - 4A)$$

where $A = 1 - H(1/4) = 0.189$. Thus we conclude that inequality (c) is
satisfied if

$$N'' - 16A(t-1)\left(1 - \frac{t-1}{N''}\right) - (1-4A)N'' - \log_2 s \ge 0,$$

which in turn is satisfied if we choose

$$N'' = 4(t-1) + \frac{\log_2 s}{4[1 - H(1/4)]}.$$

Therefore, if we assume $s = 2^n - 1 - nt$, the length $N$ is upper-bounded by

(d)   $N < 2^n - 1 + 1.32n + 4(t-1)$;

and the code redundancy is upper bounded by
$n(t + 1.32) + 4(t-1) - \log_2 (3t+1)$. Thus, asymptotically for large codeword
length $N$, the number of bits of redundancy in such a DP-t code is of order
no more than $(t+1.32) \log_2 N$.
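The choice of $N''$ and the constant 1.32 can be checked numerically. The sketch below is an illustration, not from the paper: it evaluates $A = 1 - H(1/4)$ and the coefficient $1/(4A)$, then verifies for a range of small parameters that $N'' = 4(t-1) + \log_2 s / (4A)$, rounded up to an integer (the rounding is an assumption of this sketch), satisfies the Varshamov-Gilbert condition (c). The function names are hypothetical.

```python
import math

def binary_entropy(z):
    """H(z) = -z log2 z - (1 - z) log2 (1 - z), with H(0) = H(1) = 0."""
    if z in (0.0, 1.0):
        return 0.0
    return -z * math.log2(z) - (1 - z) * math.log2(1 - z)

A = 1 - binary_entropy(0.25)

def n2_choice(t, s):
    """N'' = 4(t-1) + log2(s) / (4A), rounded up to an integer length."""
    return math.ceil(4 * (t - 1) + math.log2(s) / (4 * A))

def gilbert_holds(n2, t, s):
    """Inequality (c): N'' - log2(sum_{i<t} C(N'', i)) >= log2(s)."""
    ball = sum(math.comb(n2, i) for i in range(t))
    return n2 - math.log2(ball) >= math.log2(s)

print(round(A, 3), round(1 / (4 * A), 2))     # the constants 0.189 and 1.32 of the text
print(all(gilbert_holds(n2_choice(t, s), t, s)
          for t in range(1, 7) for s in range(2, 200)))
```

The bound is not tight: for small parameters much shorter auxiliary codes exist, as the examples at the end of this section show.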

Returning to expression (a), we notice that, with the assumption
that the sequence $f_0(x), \ldots, f_{2^s-1}(x)$ be a standard Gray code, we
still have the freedom of selecting the mapping $\varphi(i)$ so that desirable
features be exhibited by the code $\mathcal{B}$. If we want to maximize the
number $K$ of points, we should choose $d_1 \ge d_2 \ge \ldots \ge d_{s-1}$: this
is realized by choosing $\varphi(1)$ at the largest distance from $\varphi(0)$
and successively selecting each $\varphi(i)$ as that point which is the
remotest from $\varphi(0)$ and has not yet been used.

This strategy for selecting $\varphi$, however, presents the drawback
of complicating the encoding $j \to \mathcal{B}(j)$. In fact, since the
subsequences $W_i$ have different lengths, it appears necessary to perform a
costly table look-up in order to decide to which such sequence the point
$\mathcal{B}(j)$ belongs. This difficulty, of course, disappears if $|W_i|$ is a
constant independent of $i$. We can obtain this property, at some expense in
redundancy, by selecting all the $\varphi(i)$'s ($i = 1, \ldots, s-1$) at constant
largest distance $b$ from $\varphi(0)$. With this choice, we readily obtain the
following encoding and decoding algorithms (note that decoding here
does not imply any error-correction), both of which assume $|V_i| = d$,
$|U_{i-1,i}| = b$ and $m(x) = \sum_{m=1}^{d} x^{i_m}$.

Encoding algorithm $i \to \mathcal{B}(i)$

Let the integer $i$ be given.

1.  Express $i$ as $i = p(d+b) + r$ where $r < d + b$.

2.  Express $p$ by its binary Gray code representation $f_p(x)$.

3.  If $r - b = h > 0$, compute
$v(x) = f_p(x) m(x) + \sum_{m=1}^{h} x^{i_m + \delta(p)}$,
set $u(x) = \varphi[\delta(p)]$ and go to step 5. Else, proceed.

4.  Compute $v(x) = f_p(x) m(x)$ and $u(x)$ as the (r+1)st element
on the path $U_{p-1,p}$.

5.  $\mathcal{B}(i) = (v(x), u(x))$.

Decoding algorithm $w \to \mathcal{B}^{-1}(w)$

Let the codeword $w = (v(x), u(x))$ be given.

1.  Express $v(x)$ as $v(x) = f_p(x) m(x) + r(x)$ where
$\deg[r(x)] < \deg[m(x)]$.

2a.  If $r(x) = 0$, $p$ is the integer whose Gray code representation
is $f_p(x)$. Then, $u(x)$ belongs to the path $U_{p-1,p}$ and let it be the
(r+1)st element on this path. Then $\mathcal{B}^{-1}(w) = p(d+b) + r$.

2b.  If $r(x) \ne 0$, inspect $u(x)$ and determine $\delta = \varphi^{-1}(u)$
(table look-up). Then determine, by performing at most $d$ trials, the
integer $r$ for which

$$v(x) + \sum_{m=1}^{r+1} x^{i_m + \delta} = v^*(x)$$

is a multiple of $m(x)$. Express $v^*(x)$ as $f_p(x) m(x)$. Then
$\mathcal{B}^{-1}(w) = p(d+b) + d + r$.

Before closing this section, we illustrate the technique just
presented by the following examples.

Examples (1) A (71,9,1)-code can be constructed as follows. We choose
the (7,4) single-error-correcting Hamming code as $C'$. The resulting code
$\mathcal{B}'$ has parameters $N' = 7$, $s = 4$ and $d = 3$. The code $C''$ can be
chosen as $\varphi(0) = 00$, $\varphi(1) = 11$, $\varphi(2) = 10$,
$\varphi(3) = 01$ (i.e., $N'' = 2$). Since $d_2 = d_3 = 1$ and $d_1 = 2$, using
expression (a) we obtain

$$K = 2^4 \cdot 3 - 1 + (2^3 \cdot 2 + 2^2 \cdot 1 + 2^2 \cdot 1) = 71$$

(2) Consider the (15,7) BCH double-error-correcting code as
the code $C'$ for the construction of $\mathcal{B}'$ of parameters $N' = 15$,
$s = 7$, $d = 5$. We must also choose a code $C''$ having at least 7 codewords at
minimum distance $(5-1)/2 = 2$: one such code is the set of the even
weight binary sequences of length 4. Then we can select $\varphi(0) = 0000$,
$\varphi(1) = 1111$, and $\varphi(2), \ldots, \varphi(6)$ as weight 2 codewords.
From (a) we readily obtain

$$K = 2^7 \cdot 5 - 2 + 376 = 1014$$

thus, we have a (1014,19,2)-code.

(3) Finally we construct a DP-3 code. Consider the famous
(23,12) triple-error-correcting Golay code as the $C'$ code. We obtain
a code $\mathcal{B}'$ with parameters $N' = 23$, $s = 12$, $d = 7$. The code $C''$
must have at least 12 codewords with minimum distance 3; then $C''$ can be
chosen as the (7,4) single-error-correcting code. We select the mapping
$\varphi$ so that $W[\varphi(0)] = 0$, $W[\varphi(1)] = 7$, $W[\varphi(i)] = 4$
for $i = 2, \ldots, 8$, $W[\varphi(j)] = 3$ for $j = 9, \ldots, 11$. We obtain

$$K = 2^{12} \cdot 7 - 3 + 22512 = 51181$$

that is, a (51181,30,3)-code. If instead we choose an (8,4) Hamming
code as $C''$, this allows the following selection of $\varphi$:
$W[\varphi(0)] = 0$, $W[\varphi(i)] = 4$ for every $i \ne 0$. This results in

$$K = 2^{12} \cdot 7 - 3 + (2^{12} + 1) \cdot 4 = 45057$$

that is, a (45057,31,3)-code which is very simple to encode and decode.

In  the  next  section  we  shall  compare  the  efficiency  of  the 
DP-codes  just  described  against  upper-bounds  on  the  attainable 
efficiency. 

4.   Bounds on the Information Content of DP-codes

To  evaluate  the  information  content  of  DP-codes,  several  authors 
have  referred  to  the  number  K(N,t)  of  codewords  in  an  (N,t)-code.   These 
bounds  have  usually  been  presented  for  circuit  codes  (i.e.  when  the 
codeword  sequence  forms  a  closed  path)  but  asymptotically  they  have  the 
same  behavior  as  the  corresponding  bounds  for  path-codes,  i.e.  our 
DP-codes. 

Upper bounds to K(N,t) are based on geometric arguments [1,5,7]
and are reminiscent of the Hamming bound for error-correcting codes.
Except for rather small values of N and t, the discrepancy between the
bounds and the best codes discovered is substantial. The upper bounds,
however, are interesting in their asymptotic behavior, because they
provide a useful guideline for evaluating the efficiency of various
code constructions. For given threshold t, the best upper bound
known is due to Douglas [5]; it represents a small refinement of a
result obtained by Chien, Freiman, and Tang [1]. This bound is of the
form

(e)     $K(N,t) < 2^{N[1 - H((t-1)/2N)] + o(N)}$

where H( ) is the entropy function. A variant of this bound, presented
by Wyner [14], gives asymptotically a substantial improvement over (e)
when t is a constant fraction of N. The bound expressed by (e) indicates
that at least $N \cdot H[(t-1)/2N]$ bits of redundancy are needed by any
(N,t)-code.
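As a rough numerical illustration of bound (e), the following sketch evaluates the leading redundancy term N H[(t-1)/2N] for a few parameter pairs. It assumes that H is the binary entropy function measured in bits and it ignores the o(N) term; the function names and the sample values of N and t are ours, chosen only for illustration.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def redundancy_lower_bound(n: int, t: int) -> float:
    """Leading term N * H((t - 1) / (2N)) of the redundancy implied by (e).

    Assumption: the o(N) correction in the exponent of (e) is dropped,
    so this is only indicative for finite N.
    """
    return n * binary_entropy((t - 1) / (2 * n))


# Sample parameters (illustrative only).
for n, t in [(31, 3), (63, 3), (127, 5)]:
    print(f"N={n:4d} t={t}: about {redundancy_lower_bound(n, t):.2f} bits")
```

For t = 1 the argument of H vanishes, so this leading term gives no redundancy requirement at all; any constraint for t = 1 is hidden in the neglected o(N) term.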

The constructions of DP-codes proposed by various authors
provide lower bounds to K(N,t). With one notable exception, there is
a substantial gap between upper and lower bounds. In fact typical
lower bounds to K(N,t) for $t \ge 2$ are of the form (see [1], [12])

(f)     $K(N,t) \ge 2^{2N/t}$     (for $t \ge 2$)

whereas for t = 1 Vasil'ev [13] and Danzer and Klee [2] were able to
obtain

        $K(N,1) \ge 2^{N - \log_2 N + o(N)}$


Let  us  now  consider  the  DP-codes  constructed  in  Section  3. 
From  relations  (b)  and  (d)  we  obtain 

        $K > 2^{2^n - 1 - nt} = 2^{N - (t+1.32)\log_2 N + o(N)}$

which shows that their redundancy is at most of order $(t+1.32)\log_2 N$.
Hence the codes described in Section 3 are comparable in efficiency
to the codes constructed by Vasil'ev and Danzer and Klee for t = 1,
while for $t \ge 2$ they are superior to previously known codes. Furthermore,
for fixed t and asymptotically in N, their efficiency is of the
same order as that prescribed by the upper bounds.

5.   Simple  code-composition  techniques 

The  class  of  codes  presented  in  section  3  is  very  efficient 
for  large  ranges.   For  short  codes  there  is  a  certain  disadvantage  in 
that  not  all  codeword  lengths  can  be  selected.   This  is  due  to  the  fact 
that the construction of the code must be based on existing binary cyclic
codes.   Thus  there  may  be  substantial  gaps  between  realizable  word 
lengths  of  codes  in  that  class. 

For  the  application  to  pattern  classification  problems,  where 
small  ranges  are  usually  needed,  other  code  construction  techniques  may 
be  more  appropriate,  and  so  we  describe  two  of  these.   The  efficiency 
of  codes  constructed  according  to  these  methods,  however,  compares 
unfavorably  to  that  of  the  codes  of  section  3  for  large  ranges:   the 
redundancy  is  typically  proportional  to  the  codeword  length  N,  as  opposed 
to $t \log_2 N$ for the codes of section 3.

For small values of N (say $N \le 5$) and t it is fairly easy to
construct optimal codes (i.e. those of maximal range) by hand. These
may be used as building blocks for the composition techniques described
in this section. The table below gives the value of $K_{max}(N,t)$ for N
from 1 to 6, and t from 1 to 5.
Table:   Ranges $K_{max}(N,t)$ for optimal DP-codes.

    t \ N |   1    2    3    4    5    6
    ------+------------------------------
      1   |   2    3    5    8   14   25
      2   |   2    3    4    6    8   14
      3   |   2    3    4    5    7    9
      4   |   2    3    4    5    6    8
      5   |   2    3    4    5    6    7


Notice that for $t \ge N - 1$, we have $K_{max}(N,t) = N+1$, an optimal
(N,t)-code being any shortest path in the N-cube that joins two
vertices that are at maximal distance N from each other. For t = N - 2,
$K_{max}(N,t) = N + 2$, an optimal code being a shortest path between opposite
vertices of the N-cube, augmented by one more codeword. For t < N - 2,
there appears to be no simple description of optimal (N,t)-codes.
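For very small N the maximal range can also be found mechanically. The sketch below is our own exhaustive search, not a procedure taken from the paper: it grows a path in the N-cube one codeword at a time and keeps only extensions that satisfy both DP-t conditions against every earlier codeword. The function names are hypothetical, and the search is practical only for small N (it is run here for N up to 4).

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two words stored as integers."""
    return bin(a ^ b).count("1")


def can_append(path: list[int], w: int, t: int) -> bool:
    """Check the DP-t conditions between w and every word already in path."""
    k = len(path)
    for j, u in enumerate(path):
        d = k - j                      # index difference to the new word
        h = hamming(u, w)
        if d <= t and h != d:          # property 1: distance equals difference
            return False
        if d > t and h <= t:           # property 2: distance exceeds threshold
            return False
    return True


def k_max(n: int, t: int) -> int:
    """Largest range of a DP-t code of length n, by depth-first search."""
    best = 0

    def extend(path: list[int]) -> None:
        nonlocal best
        best = max(best, len(path))
        last = path[-1]
        for bit in range(n):           # consecutive words differ in one bit
            w = last ^ (1 << bit)
            if can_append(path, w, t):
                path.append(w)
                extend(path)
                path.pop()

    extend([0])                        # by symmetry, start at the all-zero word
    return best


for n in range(1, 5):
    print(n, [k_max(n, t) for t in range(1, 4)])
```

Starting every path at the all-zero word loses nothing, because adding a fixed word bitwise modulo 2 to all codewords leaves every Hamming distance unchanged.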

The  following  two  techniques,  judiciously  used,  allow  one  to 
construct  reasonably  efficient  DP-codes  for  any  small  values  of  N  and  t. 
Both are based on the idea of combining an (N',t')-code with an
(N",t")-code  to  form  an  (N,t)-code,  each  of  whose  codewords  w  is  the 
juxtaposition  of  a  codeword  u  of  the  first  code  and  a  codeword  v  of  the 
second  code.   It  is  a  simple  exercise  to  verify  that  each  of  these 
techniques  yields  a  DP-code  with  parameters  as  stated. 

a)  Threshold addition technique.

Let $u_1, u_2, \ldots, u_p$ be an (N',t')-code, and $v_1, v_2, \ldots, v_q$
be an (N",t")-code, with $p \le q$. Then if p < q,

    $u_1v_1,\ u_1v_2,\ u_2v_2,\ u_2v_3,\ u_3v_3,\ u_3v_4,\ \ldots,\ u_pv_p,\ u_pv_{p+1}$

is an (N'+N", t'+t")-code of range K = 2p. If p = q, $v_{p+1}$ does not
exist, and K = 2p - 1.
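The threshold addition technique can be sketched in a few lines. The code below is a minimal illustration under our own conventions: codewords are bit strings, the helper names are hypothetical, and the two small component codes are chosen by us for the demonstration. The brute-force checker tests the two defining DP properties directly; it confirms the stated parameters for these inputs only and is not a proof of the technique.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def is_dp_code(words: list[str], t: int) -> bool:
    """Check both defining properties of a DP-t code by brute force."""
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            d, h = j - i, hamming(words[i], words[j])
            if (d <= t and h != d) or (d > t and h <= t):
                return False
    return True


def threshold_addition(u: list[str], v: list[str]) -> list[str]:
    """Interleave u and v as u1v1, u1v2, u2v2, u2v3, ... (requires p <= q)."""
    p, q = len(u), len(v)
    if p > q:
        raise ValueError("the first code must not be longer than the second")
    composed = []
    for i in range(p):
        composed.append(u[i] + v[i])
        if i + 1 < q:                  # v_{p+1} is missing when p == q
            composed.append(u[i] + v[i + 1])
    return composed


u = ["00", "01", "11"]                 # a (2,1)-code, p = 3
v = ["000", "001", "011", "111"]       # a (3,2)-code, q = 4
w = threshold_addition(u, v)           # candidate (5,3)-code of range 2p = 6
print(w, is_dp_code(w, 3))
```

Calling `threshold_addition(v, v)` exercises the case p = q, where the final word is dropped and the range is 2p - 1 = 7.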

b)  Range extension technique.

Let $u_1, u_2, \ldots, u_p$ be an (N',t)-code, and $v_1, v_2, \ldots, v_q$ be
an (N",t)-code. Denote by $V$ the sequence $v_1 v_2 \ldots v_q$, by $\bar{V}$ the sequence
$v_q v_{q-1} \ldots v_1$, by $u_i V$ the sequence obtained by juxtaposing to each
element of $V$ the codeword $u_i$ on the left, and similarly for $u_i \bar{V}$. Let

    $W_h = u_{h(t+1)+1}V,\ u_{h(t+1)+2}v_q,\ \ldots,\ u_{(h+1)(t+1)}v_q$          (h even)

    $W_h = u_{h(t+1)+1}\bar{V},\ u_{h(t+1)+2}v_1,\ \ldots,\ u_{(h+1)(t+1)}v_1$    (h odd)

The sequence $W_0, W_1, \ldots, W_{\lceil p/(t+1) \rceil - 1}$ is an (N'+N",t)-code of range
$K = \lceil p/(t+1) \rceil q + t \lfloor p/(t+1) \rfloor$.
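The range extension technique can be sketched as follows. This is our reading of the construction, with hypothetical helper names: the first codeword of each block of t+1 consecutive u-words is paired with a full sweep through the second code (forward for even h, backward for odd h), and the remaining t codewords of the block bridge to the next sweep. We assume that an incomplete final block contributes only its sweep, which is what the stated range formula suggests. The checker validates the sample outputs only.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def is_dp_code(words: list[str], t: int) -> bool:
    """Check both defining properties of a DP-t code by brute force."""
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            d, h = j - i, hamming(words[i], words[j])
            if (d <= t and h != d) or (d > t and h <= t):
                return False
    return True


def range_extension(u: list[str], v: list[str], t: int) -> list[str]:
    """Compose two DP-t codes into a longer DP-t code (sketch)."""
    p = len(u)
    blocks = (p + t) // (t + 1)                 # ceil(p / (t + 1))
    composed = []
    for h in range(blocks):
        first = h * (t + 1)                     # 0-based index of u_{h(t+1)+1}
        sweep = v if h % 2 == 0 else v[::-1]    # forward or reversed pass over v
        composed.extend(u[first] + x for x in sweep)
        if first + t + 1 <= p:                  # bridge only in a complete block
            for k in range(first + 1, first + t + 1):
                composed.append(u[k] + sweep[-1])
    return composed


c = ["000", "001", "011", "111"]                # a (3,2)-code, p = q = 4
w = range_extension(c, c, 2)                    # candidate (6,2)-code
print(len(w), is_dp_code(w, 2))
```

With p = q = 4 and t = 2 the range formula gives 2 * 4 + 2 * 1 = 10 codewords, which is the length of the list the sketch produces.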

Example 

In order to construct a (6,3)-code, we may compose it by means
of the threshold addition technique from a (3,1)-code and a (3,2)-code.
Optimal codes for these parameters are readily found:

    (3,1)-code: K = 5        (3,2)-code: K = 4

        000                      000
        001                      001
        011                      011
        111                      111
        110

Composed (6,3)-code: K = 8

    000 000
    000 001
    001 001
    001 011
    011 011
    011 111
    111 111
    111 110

This compares rather favorably with the range K = 9 of an optimal (6,3)-code.